ABSTRACT

AbaSI, a member of the PvuRts1I-family of
modification-dependent restriction endonucleases,
cleaves deoxyribonucleic acid (DNA) containing
5-hydroxymethylcytosine (5hmC) and glucosylated
5hmC (g5hmC), but not DNA containing unmodified
cytosine. AbaSI has been used as a tool for map- 
ping the genomic locations of 5hmC, an important 
epigenetic modification in the DNA of higher organ- 
isms. Here we report the crystal structures of AbaSI 
in the presence and absence of DNA. These struc- 
tures provide considerable, although incomplete, in- 
sight into how this enzyme acts. AbaSI appears 
to be mainly a homodimer in solution, but inter- 
acts with DNA in our structures as a homotetramer. 
Each AbaSI subunit comprises an N-terminal, Vsr- 
like, cleavage domain containing a single catalytic 
site, and a C-terminal, SRA-like, 5hmC-binding domain.
Two N-terminal helices mediate most of the homodimer
interface. Dimerization brings together the two catalytic
sites required for double-strand cleavage, and separates
the 5hmC-binding domains by ~70 Å, consistent with the
known activity of AbaSI, which cleaves DNA optimally
between symmetrically modified cytosines ~22 bp apart.
The eukaryotic
SET and RING-associated (SRA) domains bind to 
DNA containing 5-methylcytosine (5mC) in the hemi- 
methylated CpG sequence. They make contacts in 
both the major and minor DNA grooves, and flip the 
modified cytosine out of the helix into a conserved 
binding pocket. In contrast, the SRA-like domain of 
AbaSI, which has no sequence specificity, contacts 
only the minor DNA groove, and in our current structures
the 5hmC remains intra-helical. A conserved binding pocket
is nevertheless present in this domain, suitable for
accommodating 5hmC and g5hmC.
We consider it likely, therefore, that base-flipping is 
part of the recognition and cleavage mechanism of 
AbaSI, but that our structures represent an earlier, 
pre-flipped stage, prior to actual recognition. 

INTRODUCTION 

In the deoxyribonucleic acid (DNA) of higher organ- 
isms, cytosine occurs in several chemical forms, includ- 
ing unmodified cytosine (C), 5-methylcytosine (5mC), 
5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) 
and 5-carboxylcytosine (5caC) (1-7). These forms are ge- 
netically equivalent in terms of base-pairing and protein- 
coding, but they differ in how they interact with macro- 
molecules and influence gene expression. There is much 
interest in the effects of these modifications in epigenetic 
regulation, in development and differentiation, in neu- 
ron function and in disease. In general, the modifications 
(or 'marks') are added to cytosine in situ, following its 
incorporation into DNA in the unmodified form. DNA 
methyltransferases convert certain cytosines to 5mC, usually
within the sequence context CpG (8,9). Ten-eleven
translocation (Tet) dioxygenases then convert a subset of
these 5mC residues to 5hmC, 5fC and 5caC in consecutive,
Fe(II)- and α-ketoglutarate-dependent oxidation reactions
(10-13). The Tet dioxygenases are widely distributed across
the eukaryotic tree of life (14), from mammals to the amoe- 
boflagellate Naegleria gruberi (15). 

To learn more about the functions of modified cytosines 
in the human genome, and about the mechanisms that con- 
trol their genetic locations and levels, methods are needed 
to distinguish the modifications individually, and to map 
their positions accurately. Newly discovered 'modification- 
dependent' restriction endonucleases such as MspJI are 
helping in this regard (16). These enzymes recognize 5mC 
and 5hmC in certain sequence contexts and cleave the DNA 
wherever these occur, generating genomic fragments that 
can be sequenced and analyzed by bioinformatics (17,18). 
In common with the chemical method of bisulfite conversion
(19), enzymes of the MspJI-family cannot distinguish
between 5mC and 5hmC. Enzymes from a different group,
the PvuRts1I-family, can make this distinction, however,
and these offer a promising way to map 5hmC specifically.
PvuRts1I, from the bacterium Proteus vulgaris, was identified
many years ago by its ability to restrict T-even
bacteriophages containing 5hmC and g5hmC in their DNA (20,21),
but it aroused little interest until the recent re-discovery of 
5hmC in mammalian DNA (1,10). Since then, it has been 
purified and characterized (22,23) along with a number of 
homologs with similar, but subtly different, properties (24). 
One such homolog, AbaSI from Acinetobacter baumannii 
SDF, has been used successfully in conjunction with se- 
quencing (Aba-seq) to map the locations of 5hmC in mouse 
embryonic stem cells (7). 

AbaSI cleaves DNA containing g5hmC or 5hmC much 
more efficiently than DNA containing 5mC, by selectivity 
factors of 8000:500:1 (23). It has negligible activity on DNA
containing only C. AbaSI cleaves with some variability 3' to
the modified cytosine, 11-13 nt away on the modified ('top')
strand and 9-10 nt away on the complementary ('bottom')
strand, producing fragments with short 3'-overhangs (23).
AbaSI has no recognition sequence ('context') specificity, 
but optimal cleavage occurs when two (g)5hmC residues 
occur 21-23 bp apart on opposite DNA strands, where- 
upon cleavage takes place mid-way between them. Cleav- 
age is less efficient if one of these two cytosines is unmod- 
ified, and much less efficient if the second cytosine is miss- 
ing altogether (24). To understand this spatial requirement, 
and to learn more about the mechanism of modification- 
dependent recognition, we determined the crystal structures 
of AbaSI with substrate DNA, and without DNA. We re- 
port the structures, here, together with insights into the ac- 
tion of AbaSI gained from comparisons with the DNA co- 
crystal structures of the UHRF1 SRA domain, and the Vsr 
mismatch-repair endonuclease. 
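
The spacing requirement described above can be illustrated with a minimal sketch. It assumes that the positions of modified cytosines on each strand are already known and are expressed in top-strand coordinates, and that 'spacing' means the simple coordinate difference. The function name `optimal_pairs` and these conventions are illustrative assumptions, not part of the published analysis.

```python
from typing import List, Tuple

# Spacing window (bp) between two modified cytosines on opposite strands
# that the text reports as optimal for cleavage.
MIN_SPACING = 21
MAX_SPACING = 23


def optimal_pairs(top: List[int], bottom: List[int]) -> List[Tuple[int, int, float]]:
    """Return (top_pos, bottom_pos, midpoint) for optimally spaced pairs.

    Assumptions (hypothetical conventions, not from the paper):
    - both lists hold positions in top-strand coordinates;
    - spacing is the coordinate difference bottom_pos - top_pos;
    - the bottom-strand cytosine lies at the higher coordinate, so that
      cleavage 3' of each modified cytosine falls between the two.
    """
    pairs = []
    for p in sorted(top):
        for q in sorted(bottom):
            spacing = q - p
            if MIN_SPACING <= spacing <= MAX_SPACING:
                # Cleavage is expected roughly mid-way between the two sites.
                pairs.append((p, q, (p + q) / 2))
    return pairs


if __name__ == "__main__":
    # Example: sites at 5 (top) and 28 (bottom) qualify; 100 and 140 do not.
    print(optimal_pairs([5, 100], [28, 140]))
```

Such a filter only flags candidate site pairs; it says nothing about cleavage efficiency at pairs outside the window, which the text describes as reduced rather than absent.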

MATERIALS AND METHODS 

Protein expression and purification 

AbaSI from A. baumannii SDF, originally designated
'AbaSDFI' (23), was expressed in Escherichia coli from a
synthetic, codon-optimized, gene (Integrated DNA Tech- 
nologies or IDT) and purified as previously described 
(23,24). A chitin-binding domain-intein tag was fused at its 
C-terminus for affinity purification purposes, and for crys- 
tallography, three cysteine residues were changed to serine 
to reduce oligomerization (described below). Typically, 6 L
cultures were grown at 30°C to late log phase, whereupon
expression was induced by the addition of isopropyl
β-D-1-thiogalactopyranoside (IPTG) to 0.2 mM, followed by
overnight incubation at 16°C. Cells were harvested by
centrifugation and lysed by French Press in 20 mM Tris-acetate (pH
8.0) and 500 mM potassium acetate (lysis buffer), followed 
by centrifugation at 18000 rpm. The cleared extract was 
loaded onto a chitin column [~30 ml of chitin beads (NEB 
#S6651) were poured into a ~80 ml gravity-flow column] 
pre-equilibrated with lysis buffer. The column was washed 
with more than 10 column volumes of lysis buffer until a
Coomassie-stained blot revealed little further protein eluting
from the column. To induce intein-mediated cleavage, 50 ml of
lysis buffer containing 30 mM dithiothreitol (DTT) was
added to the column and incubated at 4°C overnight.
Liberated AbaSI was then eluted from the column with lysis
buffer containing 5 mM DTT until a blot revealed little
further protein being recovered. At this stage, AbaSI was
~90% pure.

Pooled protein was diluted 5-fold to ~100 mM potas- 
sium acetate in 20 mM Tris-acetate (pH 8.0), 5 mM DTT 
and loaded onto tandem HiTrap Q/Heparin columns (GE 
Healthcare). Most of the AbaSI flowed through the Q col- 
umn onto the Heparin column from which it was eluted 
as a single peak using a linear gradient of potassium ac- 
etate from 100 mM to 1 M. The position of the largest pro- 
tein peak in a Superdex 200 column (GE Healthcare) ap- 
peared to indicate that it was mainly a dimer, with some 
higher molecular weight oligomers (Supplementary Figure
S1a). This prompted us to create the variant 'AbaSI-C3S'
by changing three cysteine residues, at positions 2, 309
and 321, to serine. Cys2 (the first amino acid) and Cys321
(the last) are unique to AbaSI, whereas the equivalent of
Cys309 in other family members is serine or threonine
(Figure 1a). AbaSI-C3S, expressed and purified in the same
way as the native protein, chromatographed as a single peak
on the sizing column (Supplementary Figure S1b), was
enzymatically active (Supplementary Figure S1c), and was the
form used for crystallization (Supplementary Figure S1d).

Site-directed mutagenesis and activity assay 

Alanine substitution mutagenesis was carried out by
polymerase chain reaction (PCR) using Vent polymerase (NEB
#M0254) and pairs of synthetic mutagenic oligos (IDT).
The PCR products were digested with DpnI to reduce template
DNA carry-over, and transformed into E. coli strain
T7 Express (NEB #C2566). All constructs were sequenced 
to verify that the desired mutations were present, with no 
additional changes. 

Mutants were grown in 10 ml cultures and induced with 
IPTG, as described above. Crude cell-extracts were diluted 
1-, 10- and 100-fold in 250 mM potassium acetate, 10 mM
Tris-acetate, pH 8.0, 0.2 mg/ml BSA (NEB Diluent E) and
incubated with 200 ng of β-glucosylated phage T4 DNA
in 50 mM potassium acetate, 20 mM Tris-acetate, 10 mM 
magnesium acetate, 1 mM DTT, pH 7.9 (NEB buffer 4) at 
room temperature (~25°C) for 20 min. The reaction products
were then electrophoresed in 0.8% agarose gels. We
note that a mutation may also affect protein expression
level and/or stability, which would likewise appear as
diminished enzyme activity in our assays using cell lysates.
Under our assay conditions, the nonspecific nuclease activity
of the extract from the vector control (see Figure 2e, the last
three lanes) is indistinguishable from that of the mutants
with residual activity (<10%).

Crystallography 

We crystallized AbaSI-C3S in the absence of DNA, in the
presence of substrate DNA and in the presence of product
DNA (Supplementary Table S1). Crystallization was carried
out by the sitting-drop vapor-diffusion method at 16°C,
using equal amounts of protein (or protein-DNA mixtures)
and well solutions. Crystals of protein alone (~20 mg ml⁻¹)
could be grown with 2 M ammonium sulfate and 100 mM
HEPES (pH 7.0).

Figure 1. Monomeric structure of AbaSI. (a) Sequence alignment of AbaSI family members. The AbaSI residue numbering is shown above the sequence
alignment. Amino acids highlighted are either invariant (white against black) among the seven proteins or similar (white against gray) as defined by the
following groupings: V, L, I and M; F, Y and W; K and R; E and D; Q and N; E and Q; D and N; S and T; and A, G and P. Helices are labeled αA-αG;
strands are labeled β1-β16. Residues are indicated for forming the hydrophobic cores (see panels d and e), DNA phosphate binding (P), metal ion binding
(M), structural turns (t) and surface exposed invariant residues (s). We note that residues 81-98 including strands β4 and β5 are unique to AbaSI, and
helix αG is formed by the variable sequences. The first four N-terminal residues and the last three C-terminal residues (in gray) were not observed due
to lack of continuous electron densities. (b and c) Two views of monomer structure of AbaSI, with N-terminal domain colored in turquoise, C-terminal
domain colored in blue and the linker region in gray. The red triangle in panel b indicates a crevice formed between strands β2 and β3 where the putative
cleavage active site is located (see Figure 2d). (d) Side chains from helices (αA, αB and αD) and strands (β1, β2, β3 and β6) (which are faded in the
background) are buried in between forming a hydrophobic core of the N-terminal domain. (e) Side chains from helices (αE and αF) and strands β9, β13,
β14 of the central curved sheet and strands β7 and β16 (which are faded in the background) are buried in between where they form a hydrophobic core of
the C-terminal domain. In addition, four tryptophan residues of strand β9, β11, helix αG and strand β15 form a 4W pocket with additional contribution
from hydrophobic residues from strands β9, β11, β12, β13 and β14 (see panel a).

For the protein in complex with substrate DNA, we
started with a 28-bp double-stranded oligonucleotide
(oligo), the minimum required length, containing one
5hmC five bases in from the 5' end of the top strand and
a second 5hmC (or C) 22 bp away, at position 28 on the
opposite strand (Supplementary Figure S2a). This design
was then lengthened one or two bp at a time to obtain oligos
of 29-, 30- and 30-bp plus 5'-overhanging thymines (30
+ 1), 32- and 32-bp plus 3'-overhanging thymines (32 + 1)
(Supplementary Figure S2b). We used protein dimer:DNA
ratios of 0.5:1, 1:1 and 2:1, with oligos of varying lengths.
All combinations resulted in crystals in the same P2₁ space
group, with varied diffraction limits. The best crystals
contained ~20 mg ml⁻¹ protein, had a 1:1 ratio of dimeric
protein to 32-bp DNA and grew in 21-24% (w/v) polyethylene
glycol 3350, 200 mM ammonium tartrate, 100 mM BisTris
(pH 6.0-6.4), 5 mM calcium acetate. Although calcium ions
were present in the crystallization medium, they were not
observed in the endonuclease catalytic sites in the structures,
possibly due to chelation by the organic acids that
were present.

For the protein in complex with product DNA, we used a
14-bp duplex oligo with a 4-nt, 3'-overhang, to mimic cleavage
of the 32-bp oligo, and dimer-to-DNA ratios of 0.5:1
or 1:1. At the time, a 4-base overhang seemed reasonable;
now we suspect that oligos with a 2-base overhang, and 
slightly closer 5hmC bases, might be more informative. Ex- 
periments are in progress to address this. The crystals ap- 
peared under the conditions of 16-22% (w/v) polyethylene 
glycol 3000 or 3350, 100 mM HEPES (pH 7.2-7.8), with 
or without 170-220 mM ammonium tartrate or 2% Tacsi- 
mate (a mixture of weak organic acid salts including tar- 
trate; Hampton Research). The crystallization conditions 
for the structure presented (Supplementary Table S1) were
20% (w/v) polyethylene glycol 3350, 200 mM ammonium 
tartrate and 100 mM HEPES (pH 7.4). 

Selenomethionine (SeMet) was used for crystallographic
phasing (25). Instead of Luria-Broth medium, AbaSI-C3S
was expressed in E. coli BL21(DE3) utilizing M9 minimal
medium (supplemented with glucose and the Difco yeast
nitrogen base without amino acids and ammonium sulfate)
where the L-amino acids Lys, Thr, Phe, Leu, Ile and Val were
added immediately before L-SeMet addition and IPTG
induction. A single-wavelength anomalous dispersion (SAD) data set
was collected from a crystal of selenomethionyl AbaSI-C3S
(containing four methionines per molecule) in complex
with 32-bp substrate DNA. The AutoSol module of
the PHENIX software (26) identified a total of 16 selenium
atoms. One set of four selenium atoms could be related to
three other sets of four atoms, indicating four monomers
in the asymmetric unit. The resulting electron density for
α-helices, β-sheets and molecular envelopes could be visualized,
but side chains and connecting loops could not be
easily identified. A second Se-SAD dataset showed better
traceable density that allowed a monomer to be completely
traced. Molecular replacement using this monomer located
the other three monomers and the DNA in the asymmetric
unit.

Figure 2. Dimeric structure of AbaSI. (a) Two AbaSI monomers (molecules A and B) form a dimer, mediated primarily by N-terminal helices (αA and
αB). Labels 'N' and 'C' indicate amino and carboxyl termini of each molecule. (b and c) Examples of interactions involved in dimer interface. (d) A model
of N-terminal Vsr-like endonuclease domain in complex with metal ions. Superimposition of the AbaSI N-terminal endonuclease domain (in turquoise)
and the Vsr-DNA-Mg²⁺ complex (PDB: 1CW0) identified structurally equivalent residues in AbaSI (in color) and in Vsr (in gray). The two Mg²⁺ ions
(shown as green balls and labeled as 'M') are near the proposed catalytic residue Asp61 of strand β2. Other invariant charged residues among the AbaSI
family near the modeled metal ions are Lys23, Glu26, Glu72, Asp74 and Glu75. (e) Activities of WT, mutants (K23A, D61A, D74A, E75A, H77A and
H78A) and vector control. Lane 0: 1-kb DNA marker (NEB); lanes 1, 2 and 3: dilutions of 1-, 10- and 100-fold of crude cell-extracts.

All data sets were processed using the program HKL2000
(27). Phasing, map production, model refinement and
molecular replacement were performed using PHENIX
(28). Maps and models were visualized with COOT (29),
which was also used for manual model manipulation during
refinement rounds. Individual crystallographic thermal
B-factors were refined only at the end stage of the refinement
process. In addition, rigid-body motion of domains and
inter-domain hinge motion, identified by the server of TLSMD
(translation/libration/screw) (30,31), were also applied in
the refinement. Molecular graphics were generated using
PyMol (DeLano Scientific LLC).

RESULTS AND DISCUSSION 

AbaSI forms a dimer 

We determined the crystal structure of AbaSI on its own, 
with substrate DNA oligonucleotides (oligos), and with a 
product oligo. The crystallographic asymmetric unit con- 
tained one dimer in the absence of DNA and two dimers in 
the presence of DNA. The overall structures of the dimers 
were closely similar in all crystal forms, with pairwise
root-mean-square deviations of ~1.7 Å across 627 pairs of Cα
atoms. We describe below the general structural features of
AbaSI based on the complex with product DNA, which is
representative of all of the structures, and was obtained at
the highest resolution of 2.9 Å.
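
For readers unfamiliar with the metric, the following minimal sketch shows how a root-mean-square deviation over paired Cα atoms is computed. It assumes the two coordinate sets have already been superimposed and paired atom-by-atom; the superposition step itself is not shown, and the function name `rmsd` is illustrative rather than taken from any software used in this work.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Root-mean-square deviation (same units as the input, e.g. Å).

    Assumes `a` and `b` are equal-length lists of paired, already
    superimposed atomic coordinates (hypothetical input format).
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(a, b):
        # Squared distance between each pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(a))


if __name__ == "__main__":
    # Two toy "structures" of two atoms each; one atom is displaced by 2 Å.
    print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)]))
```

In practice the 627 Cα pairs would come from a structural alignment of the two dimers, and the value reported depends on which atoms are included in that alignment.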

Monomeric AbaSI (37.7 kDa) consists of seven helices
(αA-αG) and 16 strands (β1-β16) (Figure 1a-c). Sequence
alignment of characterized members of this family shows
that AbaSI has an 18-residue insertion between strand β3
and helix αC and five smaller insertions or deletions of four
to eight residues, mostly in loops (Figure 1a). Sequence
conservation is scattered throughout the protein, and includes
residues involved in structural integrity, such as hydrophobic
cores and inter- or intra-domain interactions (Figure
1d-e), as well as those with functional significance, such as
DNA-binding, metal-ion coordination and catalysis.

Two monomers, A and B, interact with an interface of
~1030 Å² primarily through their two N-terminal helices,
with additional contributions from the amino portion of helix
αD (Figure 2a). These interactions include ion pairs between
the side chains of Asp11 of helix αA and Arg24 (Figure
2b); hydrogen bonds (H-bonds) involving polar residues
within and flanking helix αB (Tyr28, Ser31, Arg32, His35
and Asn38); and hydrophobic and aromatic residues of
helix αA (Tyr22 and Ile14), helix αB (Tyr28 and Leu36)
and helix αD (Leu136) (Figure 2c). Consistent with the
dimerization observed in the crystal, purified AbaSI eluted
mainly as what appeared to be a dimer during size-exclusion
chromatography, with some tetramer and higher oligomers
(Supplementary Figure S1a). The dimension along the long
axis of the dimer is ~100 Å (Figure 2a), sufficient to span
~30 bp in B-form DNA. The recently reported monomeric
structure of PvuRts1I (32) has the same dimeric interface,
mediated by the crystallographic symmetry (Supplementary
Figure S3).

The N-terminal Vsr-like endonuclease domain 

The N-terminal dimerization helices αA and αB, together
with αD, support and pack against one side of the
four-stranded central β-sheet of the N-terminal domain.
Curvature of these strands, a mix of anti-parallel (β1 and β2)
and parallel (β3 and β6) topologies, results in a crevice
between strands β2 and β3 where the putative catalytic site for
DNA cleavage is located (Figure 1b, indicated by a triangle).
The large majority of the conserved hydrophobic side
chains intercalate with each other at the interface of the α
helices and the β strands to form the hydrophobic core of
the N-terminal domain (Figure 1d). Two small additional
anti-parallel strands (β4 and β5) alongside strand β2 are
unique to AbaSI due to the 18-residue insertion (Figure 1a
and b).

VAST (vector alignment search tool) (33) and DALI (distance
matrix alignment) searches (34) against structures in
the Protein Data Bank (PDB) showed that parts of the
AbaSI N-terminal domain superimpose well on the very
short patch repair DNA-nicking endonuclease, Vsr (35,36),
on an intron homing endonuclease, I-Bth0305I (37), on
the cleavage domain of the Type IIG restriction endonuclease,
BpuSI (38) and on several uncharacterized 'restriction
enzyme-like' proteins (Supplementary Figure S4). Structurally
similar elements include helices αB and αD, strands
β2 and β3 containing the catalytic residues, and strand
β6 (Supplementary Figure S4). Among these matching
endonuclease structures, only Vsr has been crystallized with
the essential catalytic co-factor, Mg²⁺, and with substrate
DNA, which is cleaved (36).

Using the coordinates of the Vsr complex (PDB: 1CW0),
we superimposed the proteins and positioned the Vsr DNA
and Mg²⁺ ions over the N-terminal domain of AbaSI. The
superimposition showed that the catalytic site of AbaSI is
an unusual variant of the PD-D/EXK endonuclease superfamily
(39-41) and perhaps also coordinates two Mg²⁺ ions
(Figure 2d). The side chain of Asp61 in β2 (=Vsr Asp51),
the main chain carbonyl of Val73 in β3 (=Vsr Thr63) and
the side chain of Glu75 in the loop after β3 are positioned to
coordinate Mg²⁺ ions directly, as is customary in these catalytic
sites. Mutation of Asp61 (D61A) or Glu75 (E75A)
abolished activity (Figure 2e), as was also reported for the
corresponding mutations (D57A and E71A) in PvuRts1I
(32). In addition, the side chains of Glu72 and Asp74, conserved
among the members of this enzyme family (Figure
1a) (23), might also participate in metal-ion coordination,
but indirectly, via intermediate water molecules. The
D74A mutant retained partial activity (Figure
2e).

Alternatively, as occurs in BamHI (42), Asp74 or Glu75
might act as the general base in the catalytic reaction, assisting
in the creation of the hydroxide nucleophile needed for
in-line attack on the phosphorus atom, a hypothesized role
usually assigned to the lysine (K) of the PD-D/EXK motif
(43). In Vsr, His64 or His69 is positioned to act as the general
base rather than lysine; His77 or His78 of AbaSI might
act in this way, too. Mutation of His78 (H78A) eliminated
activity whereas H77A retained WT activity (Figure
2e). Otherwise, Lys23, recruited from the loop preceding
αB, is positioned to do this instead. Lys23 also forms an
ion bridge with Asp61 in the absence of metal ions (Figure
2d). Mutation of Lys23 (K23A) abolished activity (Figure
2e), as did the corresponding lysine mutation (K17A) in
PvuRts1I (32). Loss of activity in the K23A and H78A mutants
confirms the importance of these residues, both of which are
highly conserved in members of this enzyme family.

Several other invariant residues are located near the
catalytic site. Glu26 of helix αB (=Vsr Glu25) forms an
H-bond with Gln47 of β1 (=Vsr Gln42). Next to Gln47,
Gln48 interacts with the main chain carbonyl oxygen of
Asp111, which in turn forms an inter-domain interaction
with Arg286 of the C-terminal domain (Figure 2d). His78
of AbaSI, located in the loop after strand β3, points away
from the active site in the present model (Figure 2d). His69 
of Vsr, located in the corresponding loop, is essential for Vsr 
endonuclease activity (35) and H-bonds with the cleaved 
phosphate group (36). 

The DNA in the Vsr complex is significantly distorted 
(36). Phe67, Trp68, and His69 of Vsr penetrate the helix 
from the major groove and wedge apart adjacent base pairs 
by ~60°. The equivalent residues of AbaSI, His77, His78, 
and Phe79, are also planar and might act in a similar way. 
Vsr-DNA superimposed on the AbaSI N-terminal domain 
fits well on one side of the catalytic site, but due to the distortion,
poorly on the other. The loop preceding helix αD
follows the major DNA groove, but the loop after strand
β3 and the AbaSI-specific strands β4 and β5 do not; if the
DNA were not distorted, these would occupy the minor
groove. Vsr DNA superimposed on the N-terminal domain
of one AbaSI subunit contacts the catalytic site of that subunit
but not the catalytic site of the other subunit in the homodimer.
This might indicate that double-strand cleavage takes
place in two steps by sequential strand-nicking reactions, or
that conformational changes occur upon binding of DNA
and/or metal ions that bring the components into proper
register.

The C-terminal SRA-like DNA-binding domain

VAST and DALI searches also revealed that the C-terminal
domain of AbaSI is structurally similar to the SET and
RING-finger associated (SRA) domains of Arabidopsis
SUVH5 (44), human and mouse UHRF1 (45-47), and
the N-terminal DNA-binding domain of MspJI (48)
(Supplementary Figure S5). In AbaSI, this domain contains
eight β-strands (in the order 10, 11, 12, 15, 14, 13, 9 and
8) that together roughly form one twisted β-sheet resembling
an arch (the 'beta-arch'). Two long curved antiparallel
strands, 15-residue β14 (residues 281-295) and
10-residue β13 (residues 268-277), are largely responsible for
this conformation (Figure 1c and Supplementary Figure
S5). Two short helices, αE and αF, pack on one side of this
sheet against strands β8, β9, β13 and β14, with conserved
hydrophobic residues in between (Figure 1b and e). One
longer helix, αG, surmounts the sheet, and provides Trp262,
one of four tryptophan residues that make up a unique '4W'
pocket (Figure 1e) within the beta-arch. We speculate that
this pocket accommodates (g)5hmC when the base is flipped
out of the helix (see below).

A rigid linker connects the two functional domains 

A 10-residue linker (residues 161-170) connects the
N-terminal Vsr-like domain to the C-terminal SRA-like domain.
The linker contains three well-conserved hydrophobic
residues (Phe163, Trp166 and Ile168) that pack against
the hydrophobic core of the N-terminal domain (Figure
1d). Numerous additional inter-domain interactions account
for ~620 Å² of interface area. The extent and conservation
of these interactions suggest that the overall AbaSI
monomer has a rather stable structure that does not change 
greatly from what we observe in our crystal forms in the ab- 
sence and presence of DNA. 

DNA-protein interactions in the AbaSI co-crystals 

For co-crystallization with AbaSI, substrate oligos of 28-32
bp were used (Supplementary Figure S2). All crystallized
in the same P2₁ space group, with two AbaSI dimers
and one DNA molecule in the crystallographic asymmetric
unit. For product, a 14-bp oligo with complementary 4-nt,
3'-single-stranded ends was used. Two such molecules annealed
via their ends in the product structure, forming a 32-bp
duplex with one phosphodiester backbone break in each
strand (Figure 3a and Supplementary Figure S6a). Regardless
of whether substrate DNA or product DNA was used,
the 32-bp duplexes stacked head-to-tail, with one neighboring
DNA molecule at each end, forming a long helix
parallel to the crystal b-axis (Figure 4a). All four AbaSI subunits
contribute to DNA backbone phosphate interactions,
each dimer spanning ~28 bp (Supplementary Figure S6b
and c). The DNA in our co-crystals was aligned with the
long axis of the AbaSI dimer, and the binding and catalytic
domains were in the correct general locations for recognition
and cleavage, but intimate contact with the DNA was
completely absent.

The only direct base contact we observed between AbaSI
and the DNA is mediated by Gln209 in the minor groove
(Figure 3d). Gln209 of molecule A of the A-B dimer forms
two H-bonds (via the amide group) with the modified base
pair, one with the O2 atom of 5hmC and the other with
the 2-amino group of its partner guanine (Figure 3e). The
Gln209 side chain also makes an additional H-bond with
the O2 atom of the thymine 5' to 5hmC. The phosphate-backbone
contacts with molecule A are concentrated on
four phosphate groups on each strand surrounding the 
5hmC:G pair (Supplementary Figure S6b). 

The side chain of the corresponding Gln209 of molecule
B points toward the minor groove but is too far away to contact
the second 5hmC 23 bp away at position 28. The corresponding
phosphate contacts by molecule B are shifted 2-3
bp to the 3' side of the second 5hmC as though the 5hmC
residues were 2-3 bp too far apart (Supplementary Figure
S6b). No major groove interactions with either 5hmC:G
base pair are evident in the structures, suggesting that the
modification status of the cytosine is not detected in the
current crystal forms. Molecule A of the dimer interacts
differently with the DNA than molecule B. When the
protein components of molecules A and B are superimposed,
the corresponding bound DNA is misaligned, and must be rotated by
~100° (Figure 3f), equivalent to 3 bp (360°/10.5 = 34°/bp),
to coincide. Similarly, superimposing the DNA juxtaposed
by molecules A and B requires that the latter be rotated by
~100° (Figure 3g). This difference
suggests that the AbaSI dimer aligns most closely with modified
cytosines that are only 19-20 bp apart. The DNA in
the co-crystal structures is essentially straight. Bending at
the center is needed for the DNA to contact the catalytic
sites, some 15-20 Å away, and this might change the optimal
spacing between the modified cytosines.
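
The conversion between helical rotation, length and base pairs used in this section can be reproduced with a short sketch. The 10.5 bp per turn figure is the one quoted in the text; the 3.4 Å rise per base pair is the textbook value for B-form DNA and is an assumption added here, as are the function names.

```python
BP_PER_TURN = 10.5  # helical repeat used in the text (bp per 360° turn)
RISE_PER_BP = 3.4   # Å per bp; textbook B-form value, assumed (not stated in the text)


def degrees_per_bp(bp_per_turn: float = BP_PER_TURN) -> float:
    """Helical twist per base pair in degrees."""
    return 360.0 / bp_per_turn


def rotation_to_bp(angle_deg: float) -> float:
    """Number of base-pair steps equivalent to a given helical rotation."""
    return angle_deg / degrees_per_bp()


def length_to_bp(length_angstrom: float) -> float:
    """Number of base pairs spanned by a given length of straight B-form DNA."""
    return length_angstrom / RISE_PER_BP


if __name__ == "__main__":
    print(f"twist per bp: {degrees_per_bp():.1f} degrees")
    print(f"100 degree rotation: {rotation_to_bp(100):.1f} bp")
    print(f"100 A dimer axis: {length_to_bp(100):.1f} bp")
```

Under these assumptions a ~100° rotation corresponds to roughly 3 bp, and a ~100 Å dimer axis to roughly 30 bp, in line with the estimates given in the text. The values apply only to straight, idealized B-form DNA and would shift if the DNA bends toward the catalytic sites.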

The second, C-D, dimer displays the same two glutamines
in the minor groove of the DNA, separated by
~22 bp in neighboring DNA molecules (Figure 3h and
Supplementary Figure S6c). These do not juxtapose the
modified cytosines, however. The side chain of Gln209 of
molecule C interacts with the O2 atoms of adjacent thymine
residues at positions 21 and 22 (Figure 3i), indicating that
the Gln209-mediated interaction is not base-specific. Despite
its singular interaction with DNA, Gln209 appears to
be non-essential, as a Gln209-to-alanine (Q209A) mutant
was found to display full wild-type activity (Figure 3k).

Figure 3. AbaSI-DNA interactions. (a, b and c) The DNA molecule is encircled by two pairs of AbaSI dimers, the A-B dimer and the C-D dimer, as
indicated by colors. The three views are related by ~90° rotations about a vertical axis as indicated. Two red circles in the panels indicate the phosphate
groups missing after annealing two product DNA molecules together by the 4-nt 3′ overhangs (Supplementary Figure S6a). (d) The A-B dimer approaches the
DNA with Gln209 side chains (in space-filling model) in the minor groove. (e) Gln209 of molecule A interacts with the 5hmC:G pair from the minor groove.
(f) Superimposition of molecules A (green) and B (blue) indicates relative rotation of the bound DNA molecules. (g) Superimposition of the corresponding
DNA components indicates relative rotation between molecules A and B. (h) The C-D dimer, spanning two neighboring DNA molecules, has its Gln209
side chains (in space-filling model) in the minor groove. (i) Gln209 of molecule D interacts with two thymines of neighboring base pairs (Supplementary
Figure S6c). (j) The helix αC-mediated dimer-dimer interaction. (k) Activities of mutants E103A, D105A, R108A and Q209A. Lane 0: 1-kb DNA
marker (NEB); lanes 1, 2 and 3 for WT and mutants: dilutions of 1-, 10- and 100-fold; lane 4: substrate of β-glucosylated T4 DNA.

Dimer-dimer interactions

Dimers A-B and C-D have few direct contacts. These
are confined to a helix αC-mediated interaction between
molecules A and C (Figure 3b and c) and a helix αG-mediated
interaction between molecule B and molecule D
of the neighboring C-D dimer (Figure 4b). Two invariant
residues, Asp105 and Arg108 (Figure 1a), are located
in helix αC and the following loop. Together with another
acidic residue of helix αC, Glu103, which could form
a potential ion bridge with Arg108, this charged surface
appears to be important for catalysis, as alanine mutations
of these three residues abolished (D105A), or severely
reduced (E103A and R108A), activity (Figure 3k). Mutation
of three surface residues of helix αG (T253A, L259A
and K263A) indicated that only Leu259 is essential (Figure
4c). Neither the αC-mediated nor the αG-mediated interactions
were observed in the absence of DNA, suggesting that
they arise only upon DNA binding. In solution, during
gel-filtration chromatography, a 1:1 dimer:DNA mixture
eluted as two peaks, one containing protein plus DNA and the
other free DNA (Figure 4d). In contrast, a 2:1 dimer:DNA
mixture eluted as a single peak of protein plus DNA. This
suggests that the complex of two dimers with one DNA duplex
seen in the crystallographic asymmetric unit (Figure 4a),
although not a specific recognition complex, is consistent
with the behavior observed in solution at micromolar
concentrations.
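
The gel-filtration result can be rationalized with a simple mass balance. The sketch below assumes an idealized tight-binding limit in which every complex contains two dimers and one DNA duplex and complex formation goes to completion; this limit and the function name are illustrative assumptions, not part of the analysis reported here. The input concentrations are the approximate values given in the Figure 4d legend.

```python
def partition(dimer_uM, dna_uM, dimers_per_dna=2):
    """Idealized mass balance: complexes form until dimer or DNA runs out.

    Returns (complex, free DNA, free dimer) concentrations in micromolar.
    """
    complex_uM = min(dimer_uM / dimers_per_dna, dna_uM)
    free_dna = dna_uM - complex_uM
    free_dimer = dimer_uM - dimers_per_dna * complex_uM
    return complex_uM, free_dna, free_dimer


# Equimolar mixture (~3 uM dimer, ~3 uM DNA): DNA is in excess.
print("1:1 mix ->", partition(3.0, 3.0))

# 2:1 mixture (~3 uM dimer, ~1.5 uM DNA): nothing is left over.
print("2:1 mix ->", partition(3.0, 1.5))
```

In this limit the equimolar mixture leaves half of the DNA unbound, matching the two observed peaks, whereas the 2:1 mixture leaves no free DNA, matching the single peak.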

Figure 4. DNA-mediated dimer-dimer interaction. (a) The crystallographic asymmetric unit (indicated by a box) contains two dimers, the A-B dimer and
the C-D dimer, and one 32-bp DNA duplex. A second repeating unit along the b-axis is shown. (b) The helix αG-mediated dimer-dimer interaction in the
crystal. (c) Activities of mutants T253A, L259A and K263A. Lane 0: 1-kb DNA marker (NEB); lanes 1, 2 and 3 for WT and mutants: dilutions of 1-,
10- and 100-fold; lane 4: substrate of β-glucosylated T4 DNA. (d) Elution profiles of four consecutive runs on a Superdex 200 (10/300 GL) column (GE
Healthcare) with 20 mM Tris acetate (pH 8.0), 150 mM potassium acetate, 5 mM DTT and 5 mM calcium acetate and containing (from bottom to top
panels) protein alone (~3 μM in 200 μl), DNA alone (~3 μM in 200 μl; 29-bp duplex, see Supplementary Figure S2b), equimolar AbaSI-C3S dimer and
DNA duplex (~3 μM in 200 μl), and a 2:1 molar ratio of AbaSI-C3S dimer (~3 μM) to DNA duplex (~1.5 μM in 400 μl). Peak heights reflect relative OD280
absorbance.

Similarities with UHRF1 SRA-DNA complex

In the absence of a specific AbaSI-DNA recognition complex,
we modeled 5hmC-containing DNA into the AbaSI
C-terminal domain using the coordinates of the mouse
UHRF1 (mUHRF1) SRA-DNA complex (PDB: 3FDE),
in which the 5mC is extra-helical and flipped from the helix
into a conserved binding pocket (47). The protein components
were superimposed (Figure 5a) to position the DNA
over the corresponding basic surface of AbaSI (Figure 5b
and c), whereupon the flipped 5mC was found to occupy a
cavity in the AbaSI C-terminal domain that we term the '4W
pocket' (Figure 5d and e). In the UHRF1-SRA pocket,
main chain atoms and the side chains of Asp474, Tyr471
and Tyr483 form the binding site for the methylated cytosine.
Asp474 and Ala468 (main chain amide nitrogen, N),
Gly470 (N) and Thr484 (main chain carbonyl oxygen, O)
H-bond with the flipped base, compensating for the loss
of the Watson-Crick H-bonds, and the two tyrosine rings
sandwich the base, compensating for the loss of aromatic
base-pair stacking (Figure 5d and e). Comparable interactions
are available for flipped 5(h)mC modeled into the
AbaSI 4W pocket. The side chains of Asn236 (=Asp474),
Glu247 (=Thr484 (O)), and the main chain of Arg227 (N)
(=Ala468 (N)) are positioned to form H-bonds (Figure 5d),
while Trp234 (=Tyr471) and perhaps Trp224 (=Tyr483) are
positioned for aromatic stacking (Figure 5e). All of these
amino acids are highly conserved among the members of
this enzyme family (Figure 1a). Mutations to alanine of
AbaSI residues that form the 4W pocket either abolished
(W234A, R269A and W304A) or impaired (N236A and
W224A) activity (Figure 5f), attesting to their importance.
Equivalent mutations of two of these residues in PvuRts1I
(W215A and E228) did the same (32).



Figure 5. A model of AbaSI in complex with modified DNA. (a) Superimposition of the AbaSI C-terminal domain (in blue) with the SRA domain of
mouse UHRF1 (in yellow; PDB: 3FDE). (b and c) Two views, related by 90°, of the DNA molecule observed in the current structure of AbaSI in comparison
with the SRA-bound DNA, suggesting that the AbaSI-bound DNA needs to move towards the 4W binding pocket with a flipped cytosine. We note that the long
Loop-12G in mUHRF1 (colored in magenta) is in the DNA major groove. (d and e) The flipped 5mC nucleotide can be docked into the binding pocket of AbaSI
(colored in blue). The SRA residues of mUHRF1 are in yellow and the long Loop-12G in magenta. (f) Activities of alanine mutants of residues that form
the 4W pocket. Lane 0: 1-kb DNA marker (NEB); lanes 1, 2 and 3 for WT and mutants: dilutions of 1-, 10- and 100-fold; lane 4: substrate of β-glucosylated
T4 DNA. For control, we included the phosphate-interacting Arg282 (Supplementary Figure S6b). (g) A model of glucosylated 5hmC in the 4W binding
pocket of AbaSI.

Differences with UHRF1 SRA-DNA complex

Three interesting differences distinguish the AbaSI
C-terminal domain from other SRA-domain proteins. The
first concerns residue 236: asparagine in AbaSI (Asn236),
but aspartate in UHRF1 (Asp474), MspJI (Asp103) and
AspBHI (Asp71). The side chain of this residue accepts
one H-bond from the 4-amino group of the flipped cytosine
and donates one H-bond to its N3 ring atom, much as
occurs during normal Watson-Crick base-pairing with guanine.
Asparagine can donate to the N3 atom via its amide
nitrogen (-NH2), but for aspartate to do so, its carboxylate
group must be in the protonated state (-COOH). This
is surprising since the pH (7-8) at which these enzymes
operate is well above the pKa (3.9) of aspartate. The same
is true for the conserved 'motif VI' glutamate (ENV; pKa
= 4.1) of the 5mC-methyltransferases (49-51), which likewise
donates an H-bond to the flipped substrate cytosine
and then protonates it preparatory to methyl transfer. The
equivalent residue in the catalytic site of thymidylate
synthase, whose substrate is dUMP, is also asparagine (52).
For asparagine to H-bond with uracil rather than cytosine,
its side chain must have the opposite orientation,
rotated by 180° via the side chain χ2 torsion angle. If the
Asn236 side chain can adopt both orientations, we anticipate
that the 4W pocket of AbaSI might accommodate
5-hydroxymethyluracil (5hmU) and glucosyl-5hmU (base J)
in addition to modified cytosine. Preliminary data indicate
that AbaSI is inactive on a 5hmU-containing DNA, however
(not shown). This suggests that the Asn236 side chain
cannot rotate, and indeed, close inspection reveals that its
orientation is probably fixed by an H-bond with the
main-chain oxygen of Arg227 (Figure 5g).
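
The protonation argument can be made quantitative with the Henderson-Hasselbalch relation. The sketch below is illustrative only: it uses the free-amino-acid pKa values quoted above and assumes they apply unchanged inside the binding pocket, which need not hold, since a protein environment can shift side-chain pKa values considerably. The function name is hypothetical.

```python
def fraction_protonated(pH, pKa):
    """Fraction of a carboxylic acid in the -COOH form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# pKa values quoted in the text; solution behavior is assumed.
for name, pKa in (("aspartate", 3.9), ("glutamate", 4.1)):
    for pH in (7.0, 8.0):
        f = fraction_protonated(pH, pKa)
        print(f"{name}, pH {pH}: {100 * f:.3f}% protonated")
```

With these assumed values, well under 1% of either side chain would be protonated at pH 7-8, which is why a protonated carboxylate in the pocket is unexpected.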

The second difference concerns the DNA-contact loops.
Our model of the AbaSI C-terminal domain bound specifically
to DNA, derived from the mUHRF1 SRA-DNA
complex, indicates that AbaSI contains an equivalent minor
groove loop (Loop-F8), but lacks the long corresponding
major groove loop (Loop-12G) used by mUHRF1 to
recognize the modified cytosine and the CpG sequence
context in which it occurs (Figure 5a). AbaSI Loop-F8
(residues 201-211) contains Gln209, discussed previously.
The corresponding minor groove loop of mUHRF1 contains
Val451, which occupies the space left behind by the
flipped 5mC, and His450, which interacts with the 5' base
pair (47). The 24-residue major groove loop of mUHRF1
contains Arg496, which recognizes the orphan guanine via
side chain H-bonds with the guanine O6 and N1 atoms,
and Asn494, which recognizes the cytosine of the adjacent
GC base pair via a main-chain H-bond (Figure 5b) (47).
Consistent with the lack of sequence specificity of AbaSI,
its corresponding Loop-12G is only four residues long,
making it too short to reach the DNA and recognize any
sequence context. In our model, Gln209 might make up for
the lack of major groove H-bonds to the orphan guanine,
when the (g)5hmC is flipped, by H-bonding with the guanine
from the minor groove instead. The two-loop mechanism
used by mUHRF1 for substrate recognition and base
flipping, in which the DNA is approached from opposite
major- and minor-groove directions, is also used by DNA
5mC-methyltransferases (53-55), DNA 5mC-dioxygenases
(15,56), and DNA repair enzymes (57) including thymine
DNA glycosylase, which excises 5caC (58-60), an oxidation
product of 5mC (12,13).

The third difference concerns the capacities of the
binding pockets. In mUHRF1, the methyl group of the
flipped 5mC interacts with the Cα and Cβ atoms of Ser486
at the beginning of Loop-12G (Figure 5d). There is
no comparable interaction in AbaSI because Loop-12G of
AbaSI is smaller and farther from the cytosine. As a result,
the AbaSI pocket can accommodate cytosines with larger
5-modifications such as glucosylation. We modeled glucosylated
hydroxymethylcytosine (g5hmC) into the 4W pocket
(Figure 5g) using the 5mC ring of the SRA-DNA complex
as the foundation. Rotating three torsion angles between
the cytosine ring and the glucosyl moiety allowed us
to generate several possible conformations. Figure 5g shows
one such conformation, in which the nitrogen of the indole
ring of Trp224 interacts with the cytosine 5-methyl oxygen
atom, and the guanidino group of Arg269 interacts with the
glucose hydroxyl groups. In addition, Glu247, positioned
alongside Asn236, could interact in several ways to stabilize
the 4W pocket or the flipped nucleotide (Figure 5g).



A plausible model of AbaSI: coupling base recognition and 
DNA cleavage 

A major difference between AbaSI and many other structurally
characterized base-flipping enzymes is that AbaSI
comprises two distinct domains, one for modified-base
recognition, the other for DNA strand cleavage. These
likely communicate and cooperate in order to cleave DNA.
The SRA-like, (g)5hmC-recognition domain comprises the
C-terminus of AbaSI, and the Vsr-like endonuclease domain
comprises the N-terminus. The order of these domains
is the reverse of that in Type IIS restriction enzymes such
as FokI (61), and in the modification-dependent restriction
enzymes MspJI (48) and AspBHI (62). DNA-cleavage domains
of restriction enzymes generally contain only one
catalytic site, whereas two are required for duplex DNA
cleavage. Usually, this shortfall is made up by dimerization
(FokI) or tetramerization (MspJI and AspBHI), which
juxtaposes pairs of catalytic sites in various ways that match the
twist and the opposed polarities of the two DNA strands.

In the AbaSI-DNA complex reported here, the N-terminal
domains are dimerized, but the two catalytic sites
are too far from the DNA to cleave it, and too far apart to
catalyze double-strand cleavage in unison. AbaSI is known
to cleave with some variability, which might be a consequence
of this geometric disparity. In PD-D/EXK catalytic
sites, the principal Mg2+ ion is typically ~3 Å from the
target phosphorus atom, coordinated to one of its
non-bridging oxygen atoms. Modeling Mg2+ ions from the
Vsr-DNA complex into both catalytic sites of the AbaSI
homodimer finds them to be ~27 Å apart, a spacing appropriate
therefore for hydrolyzing phosphates that are ~21 Å
apart. In B-form DNA, phosphates separated by a 2-nt,
3'-stagger, the principal substrates of AbaSI hydrolysis, are on
the order of 14 Å apart, that is, ~7 Å closer than the catalytic
sites in the dimer observed here (Figure 3d) [see, for
example, the co-crystal structure of Eco29kI (63)].
Conformational changes like those seen in the Vsr complex (36),
such as DNA bending and unwinding, might be needed,
then, to bring the necessary elements together for catalysis.
These conformational changes could accompany incorporation
of Mg2+ ions into the active sites. In the absence of
Mg2+ ions, AbaSI binds non-specifically to DNA containing
different target modifications with approximately equal
affinity (Kd = 1.5-2 μM; Supplementary Figure S1e and f).
It is possible that there is a temporal order for cleavage that
proceeds by recognition of the target cytosine, flipping from
the DNA helix and capture by the 4W pocket, Mg2+ binding by
the catalytic sites, and then double-strand DNA cleavage.
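
The distance bookkeeping in this paragraph can be written out explicitly. The sketch below uses only the approximate values quoted above and assumes, for simplicity, that each modeled metal ion lies on the line joining the two target phosphorus atoms, so that the two ~3 Å offsets simply subtract from the metal-metal separation; the names are illustrative.

```python
# Approximate distances quoted in the text, in Å.
MG_MG_SEPARATION = 27.0   # between the two modeled metal ions in the homodimer
MG_TO_PHOSPHORUS = 3.0    # typical metal-to-target-phosphorus distance
B_DNA_P_P_2NT = 14.0      # phosphates across a 2-nt 3'-stagger in B-form DNA


def reachable_p_p(mg_mg=MG_MG_SEPARATION, mg_p=MG_TO_PHOSPHORUS):
    """Phosphate spacing the two sites could serve, assuming collinear geometry."""
    return mg_mg - 2.0 * mg_p


spacing = reachable_p_p()
gap = spacing - B_DNA_P_P_2NT
print(f"Phosphate spacing matched by the dimer: ~{spacing:.0f} Å")
print(f"Excess over the B-form 2-nt stagger:    ~{gap:.0f} Å")
```

The ~7 Å excess is the mismatch that DNA bending and unwinding, or movement of the catalytic domains, would have to absorb before both strands could be cut.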

For optimal cleavage, AbaSI prefers two target cytosines
symmetrically positioned around the cleavage site. One
must be modified at ring position 5 by a hydroxymethyl
(5hm) or glucosylated 5hm group, but the other can be
modified or unmodified (23,24). The requirement for the second
cytosine appears not to be absolute, however. In the
'Aba-seq' mapping of genomic 5hmC in mouse embryonic stem
cells (7), the largest set of AbaSI cleavage sites (42.3%) has
one CG and one CH (H = A/C/T) on opposite sides of
the cleavage site, but the second largest set (27.3%) has one
CG on one side and no cytosine at all on the other side.
Asymmetric binding by the AbaSI dimer, with molecule A
approaching one of the two 5hmC sites and molecule B
further away from the second 5hmC (Figure 3f and g), might
account for this relaxed requirement for a second cytosine.

Diversity of restriction enzymes 

Restriction enzymes have proven invaluable as laboratory
tools for analyzing DNA molecules and rearranging them.
They occur naturally in bacteria and archaea, and come
in numerous different forms (64), from simple monomers
[e.g. MspI (65)] and dimers [BamHI (66,67); PvuII (68)], to
tetramers [NgoMIV (69); SfiI (70)], polymers [SgrAI (71)],
and complex enzymes with allosteric regulatory domains
[NaeI (72,73); EcoRII (74)]. The proteins can comprise one
domain [HindIII (75)], two domains [FokI (61,76)], three
[MmeI (77)] or more [TstI (78)]. Some cleave DNA exclusively
one strand at a time [HinP1I (79,80)], others cleave
both strands at once [EcoRI (81)] and some even multiple
strands at once [BcgI (82)]. Most require one or two Mg2+
ions (83,84), but a few require no metal ions [BfiI (85)], and
others can use an array of different metal ions [HpyAV (86)].
Some barely distort DNA when they bind [BglII (87)], some
distort it substantially [EcoRV (88)], and yet others distort
it dramatically [PacI (89)]. All this variety reflects the amazing
biochemical dexterity of microbes.

Most of the characterized restriction enzymes belong to
the Type II class and cleave unmodified DNA in which the
bases are present in their ordinary, unaltered forms. Alongside
these, we are learning, exists an alternative galaxy of
enzymes with the opposite property of cleaving DNA only
when it is modified. These 'Type IV' (90) restriction enzymes
recognize DNA in which adenine or cytosine bases are altered
by the addition of methyl groups, or small chemical
derivatives, in the major DNA groove (48,62). AbaSI belongs
to this latter group, about which little is yet known,
including how diverse and numerous they are. Whereas
Type II enzymes are used mainly for DNA cloning, Type
IV enzymes are finding uses in epigenetic analysis. They
cleave genomic DNA molecules into fragments that flank,
or bracket, the sites of cytosine modification, and that can
be analyzed by sequencing and bioinformatics. Some successes
in this regard have already been reported (7,17,18).
The work described here contributes to our growing understanding
of these new enzymes, and of their utility for
investigating the epigenetic processes of higher organisms.

ACCESSION NUMBERS 

The X-ray structures (coordinates and structure factor files)
of AbaSI have been submitted to the PDB under accession
numbers 4PAR (protein-product DNA), 4PBA (protein-substrate
DNA) and 4PBB (protein alone).